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Abstract 

In Bayesian networks, exact belief propagation is achieved through message pass- 
ing algorithms. These algorithms (ex: inward and outward) provide only a recur- 
sive definition of the corresponding messages. In contrast, when working on hidden 
Markov models and variants, one classically first defines explicitly these messages 
(forward and backward quantities), and then derive all results and algorithms. In this 
paper, we generalize the hidden Markov model approach by introducing an explicit 
definition of the messages in Bayesian networks, from which we derive all the rele- 
vant properties and results including the recursive algorithms that allow to compute 
these messages. Two didactic examples (the precipitation hidden Markov model and 
the pedigree Bayesian network) are considered along the paper to illustrate the new 
formalism and standalone R source code is provided in the appendix. 



1 Introduction 



Probabilistic graphical models (PGMs) are powerful and versatile tools to study com- 
plex random systems with many variables ( Cowell et al.[ 1999[|Jensen and Nielsen[ 2007 



Koller and Friedman, 2009). Causal PGMs are called Bayesian Networks (BNTs) and 



can be seen as a generalization of Markov models like Markov chains, Hidden Markov 



Models (HMMs), or Markov trees (Smyth et al. 1997 1. For these models, exact inference 
usually involves the so-called forward and backward quantities which can be use to obtain 
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marginal or conditional distributions. From the definition of these quantities one can de- 



rive recursive formulas that allow to obtain them through linear algorithms (Durbin et al. 



1998). 



In the case of BNTs, the same tasks is conducted through the exact Belief Propagation 
(BP) first introduced by Pearl ( 1986[ 1988) for singly connected graphs and then general- 
ized to multiply connected graphs by a serie of articles ( |Lauritzen and Spi egelhalter, 1988; 
Shaf er and Shenoy[|l 99Cty | Jensen et al.[|1990a|b[ ). Although many variants exist ( Lepar and 



Shenoy, 1998; Schmidt and Shenoy, 1998), the principle of exact BP is always basically 



the same: 1) compute the so-called messages through a recursive algorithm, 2) then com- 



bine them to obtain marginal or conditional distributions. As pointed out by Smyth et al. 



( 1997 ), these messages corresponds in fact exactly to the forward and backward quantities 
in the particular case of HMMs. However, there is a noticeable difference: in HMMs, 
messages are first defined explicitly and then used to derive results and algorithms, while 
with exact BP, messages are implicitly defined as the results of the recursion algorithms. 

The objective of the present work is to push a step forward the parallel between HMMs 
and BNTs by introducing a new formalism where we first give an explicit sense to the 
messages from which all results, recursions, and algorithms can then be derived. 

The paper is organized as follows: in Section|2]we first consider a simple HMM exam- 
ple (the precipitation HMM) that will illustrate the message orientated approach of these 
models. In Section [3] we do some recalls on BNTs, the notion evidence, and junction tree. 
We also introduce a small but illustrative BNT example (the pedigree BNT). Finally in 
Section [4] we present our new results: the explicit definition of the message functions and 
how the classical results and algorithms derive from this definition. All results are illus- 
trated both with the precipitation HMM and the pedigree BNT and standalone R source 
code is provided in the appendix. We end by discussing the possible advantages of this 
new approach. 



2 Precipitation HMM 

Let us assume that we observe daily the mm of precipitation at a given location. These 
measurements obviously depend on the atmospheric conditions. For simplification pur- 



Table 1 : Distribution of Y, conditionally to Sj in the precipitation HMM. 
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pose, we consider only two possible atmospheric conditions: low pressure (denoted L) 
and high pressure (denote H). For i = 1, . . . ,n, we denote by Y, the mm of precipitation 
observed at day i and by 5 ; the atmospheric conditions the same day, and we assume: 

i) Sl:n = (Si)i=i.... n is an homogeneous Markov chain starting with Si = H, and with 
transition probabilities given by P(5 f = L|5,_i = H) = 0.3, P(S ; - = H|Sj_i = L) = 0.1; 

ii) Yi;n = (Yi)i=i...n ls a independent sample of Poisson variables whose parameter only 
depends on Sf. E[y-|S f = L] = 3.0 and E[y-|5/ = H] = 0.5 (see Tab.[T])- 

We hence have: 

^Y 1:ni S 1:n ) = ^S 1 )P{Y l \S 1 )f\F{Si\S i - 1 )¥{Y i \Si) (1) 

If Y\ :n is observed while S\ :n is not, this results in a typical HMM where there is a 
trend to have more precipitations in period of low atmospheric pressure. Our objective 
is to study P(Si :B |Yi :n ) the distribution of the unobserved phenomenon (the atmospheric 
pressure) conditionally to the observations (the mm of precipitation). 

Following the classical approach to this problem (Durbin et al. , 1998j ), we first intro- 
duce the so called forward and backward quantities, respectively defined for all s G {L, H} 
and for i = 1 . . .n by: 

F i (s)=F(S i = S J hl ) and B t (s) = F(Y i+1:n \Si = s) (2) 

with the convention that B n (s) = 1. The critical point is then just to prove the following 
proposition: 

Proposition 1. For all i = \ ...n and for all r, s E {L, H} we have: 

F(S l =sJ v .n)=F 1 ( S )B l ( S ) (3) 

and 

P(S,-_i =r,Si = s,Y hn ) =F i - 1 (r)7C{r,s)e s (Y i )B i (s) (4) 
where n(r,s) = P(5,- = s\S t -i = r) and e s (k) = P(y f = k\S t = s). 

Proof. We prove only Eq. ([3]) since the argument is similar for Eq. ([4]). Thanks to Eq. ([T|) 
we first observe that: 

P(Sl:«,Fl:n)=P(5i: ! -,Fi :i )P(S , m:M ,F m:B |5i) 
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Figure 1: Observation of the precipitation HMM over n = 100 days. The (observed) mm 
of precipitation are given by the dots (scale on the left axis), the (hidden) status of states 
Si are given by the shape of the dots, and the posterior probability P(S; = L|Fi :n ) are given 
by the solid line (scale on the right axis). 



from which a simple marginalization gives us: 

F(Si = s,Y lM ) = £ Y t F(S U -i,S i = s,Y u )¥(S i+ i M ,Y i+ i M \Si=s) 

= £ F(S 1:i - U Si = s,Y l:i ) £ F{S i+1 . n ,Y i+hn \Si = s) 



Fi{s) Bi(s) 



□ 



From this proposition, we can easily establish all the classical results of HMM infer- 
ence. 



4 



Table 2: Five samples drawn from P(S40:6ol^i:re) using the data of Fig. [T] The reference 
value of Sj and the posterior marginal distribution P(S; = L|Fi :n ) are also given for i = 



40... 60. 
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Corollary 2 (forward and backward recursions). The forward quantities can be recursively 
computed from F\(s) = l s =Re s (Y\ ) for all? = 2 ... « with: 

F i {s)=^F l -i{r)n{r,s)e s {Y i ) (5) 

r 

and similarly, the backward quantities can be recursively computed from B„(s) = 1 for all 
i = n...2: 

B i ^ 1 (r) = Y d Jt(r jS )e s (Y i )B i ( S ). (6) 

s 

Proof. We only prove the forward recursion. We simply start from 

P(5 f = s,Y 1:n ) = £P(5 f _i = r,S t = s,Y lM ) 

r 

and apply Eq. ([3]) on the left-hand term, and Eq. Q on the right-hand term to obtain: 

Fi(s)Bi(s) = Y,F l -i(r)Tt{r,s)e s (Yi)B l {s) 

r 

which gives the forward recursion by simplifying by Bi(s). □ 

We can see on Fig. [T] and example of data produced by the model over n = 100 days. 
The posterior probability P(5 ; = L|Fi :n ) is quite consistent with the (unobserved) reference 
values of Sj. 

Corollary 3 (forward and backward sampling). The distribution of 5i :n conditionally to 
Y\ :n is an heterogeneous Markov chain whose transitions are given by 

P( , i = s | S , 1 = r ,y 1: „ ) = 5Mf£(^W (7) 



in the forward direction, and by 

F(S,.- 1 = ^ = , ) F 1:n) = ^ l(r) ;^^ (8) 
in the backward direction. 

Proof. We prove only the forward direction. We simply start from 

F(S i - 1 = r,Si = s J Y 1:n ) 



F{S i = s\Si- 1 = r,Y lyi ) 



F(Si-i=r,Yi M ) 

and use Eq. ([4]) on the numerator, and Eq. ([3]) on the denominator. □ 



For example, we can see on Tab. [2] some samples drawn from P(5i :n |Fi :n ) using the 
previous corollary. 

3 Recalls on Bayesian Networks 
3.1 Model 

Let X% = (X u ) ue y/, = {1,.. . ,/?} be a set of p discrete^ random variables such as, 
for all u G X u G % C R d " (d u G N*). Let & c ^ x <2f such that define a 

directed acyclic graph (DAG) over . For all v G ^ , we define the parent set of v as 

def 

pa(v) = {h6^, (m, v) G Then the distribution of X^ G is given by: 

Note that Eq. ([9]) defines a probability thanks to the acyclic property of graph 

Such a model is called a Bayesian network (BNT) due to the fact the distribution of Xc% is 

defined only through the conditional distributions P (X u \X pei ^) . 

Example 4. In the particular case of the precipitation HMM over n = 100 days we get the 
DAG of Fig. [2j If we denote the variable with = {-n, . . . , -1} U {1, . . . ,n} (p = In), 
then for all i = 1 . . .n we have X t = Si {% = {L, H}), = Y t = N). We hence get 
the following parent sets: pa(l) = 0, pa(z') = {z — 1} for i = 2...n, and pa(— z) = {z} for 
z = 1 . . . n. Note that replacing the generic variables by their values in Eq. (|9]) immediately 
gives Eq. ([T]). 



'it is possible to consider continuous variables as well (or even a mixture of discrete and continuous vari- 
ables) by replacing everywhere probabilities by densities, and sums by integrals. For the sake of simplicity, 
we here restrict ourselves to the pure discrete case. 
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Figure 2: DAG representing the precipitation HMM with n = 5. 

Example 5. We can see on Fig. [3] a slightly more complex BNT which represents the 
parental relationships (a pedigree) of 10 individuals. This BNT includes a loop (consan- 
guinity relationship between two cousins) but no orientated cycles. 
The distribution of Xi : io is hence given by 

P(Xi : io) = F(X l )F(X 2 )F(X 3 \X U2 )F(X 4 \X ia ) 

F(X 5 )F(X 6 )F(X 1 \X 3 ^F(X S \X 3j5 )F(X 9 \X 4j6 )F(X l0 \X 1 ^. 

For all i, Xj represents the genotype of individual i at a given disease locus. We consider 
that there is only two alleles: the disease allele D and the non disease allele d. Xj hence 
takes its value in the following set of genotypes: {dd, dD, DD} (note that genotypes dD and 
Dd are indistinguishable). 

For i G {1,2,5,6} (the founders set - individuals with no parents), we assume a 20% 
frequency for the disease allele in the general population and we get: F(Xj = dd) = 0.64, 
F(Xj = dD) = 0.32, and F(Xj = DD) = 0.04. For any other individual k, we denote by i 
and j its two parents, and according to the Mendelian transmission of alleles we get the 
following conditional distribution: 
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3.2 Evidence 

We introduce the notion of evidence by considering for all u G % a subset S£ u C 3i u of 

def 

possible outcomes for X u . For any f Cf, we define §y = {Xy G <^V}. Evidence 
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Figure 3: Pedigree BNT of 10 individuals with a consanguinity loop between Individual 7 
and Individual 9 (two cousins). 



def 

is then defined as the event £ = = {X<% E Empty evidence (or no evidence) 

corresponds to the unconstrained case where 3£ u = S) u for all u E % . Our aim is to study 
the conditional distribution 

nw) = do 

Example 6. In the particular case of the precipitation HMM, we denote by y\- M the ob- 
served precipitations. We then have for all / = 1 . . . n: 3£[ = *2>i = {L, H} (no evidence), and 
= {yd- We hence have g = {Y l:n = y hn } and F{X^\£) = F{S hn \Y hn = y hn ). 

Example 7. For the pedigree BNT, we assume that our disease locus is connected to 
a recessive disease. If a given individual i is affected by the disease we have Xt = DD, 
if he is not affected we get Xi E {dd,dD}. Assuming that individuals 8, 9 and 10 are 
affected, that individual 7 is not affected, and that we do not know the disease status of the 
remaining individuals, we get the following evidence: £ = {X\t,,5,6,9 e {dd,dD,DD},X7 E 
{dd,dD},X 2A8 ,10 = DD}. 
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Figure 4: JT for the precipitation HMM with n = 5. The star * indicates the cluster to 
which is associated each variable. 

3.3 Junction Tree 

We consider Cjr = (Ci) ie j?, J* = { 1 , . . . , q] a set of q clusters such as Q C ^ for all i G y 
and we assume the following three conditions: 

JT1) Tree. We have a tree structure on Cj: for any i, j G y it exists a unique connecting 
path, denoted path(?', j), between Q and Cj. 

JT2) Running intersection. For any i, j G J^, C, fi Cj C Q for all G path(z, j) . 

def 

JT3) Covering. For any w G 1^, it exists at least one i e such as the family set fa(«) = 
pa(w) U {m} C C;-. 

Such a cluster tree is called a junction tree (JT) associated to the BNT Note that the 
tree composed by a single cluster C\ = y is always a junction tree, thus proving the 
existence of such object. However, finding a JT minimizing some criterion (typically the 
cardinal of the largest cluster) is known to be a NP-hard problem in general ( Arnborg et al.[ 



1987b. Fortunately, it exists several heuristics that can build "reasonable", but possibly 



suboptimal, JTs ( |Jensen and Jensen[ |1994| [Becker and Geiger[ [T996; Sh oiket and Ge iger 



1997) 



We assign for all u G a cluster cl(w) G J ', such that fa(w) G cl(w). In the case that 
there are more than one cluster that fulfill this condition, we arbitrarily select one among 
them. Note that the condition (JT3) guarantees the existence of at least one possibility. 

Example 8. In the particular case of the precipitation HMM we can build the simple 
JT which is a chained sequence of n clusters: C\ = {1,-1} and C ; = {i— — i} for 
i = 2. ..n. In order to improve readability from now on we will use the original name of 
the variables rather than its index u (ex: 54 instead of 4, X2 instead of —2) whenever the 
notation is not ambiguous. We can therefore write C\ = {S\,Y\} and Q = {S,_i,S;, Y,} 
for i — 2 . . . n (see of Fig. |4]for an example with n = 5). The resulting structure obviously 
fulfills the three JT conditions. For i = 1 . . .n, variables Yj and 5, are assigned to cluster C ; . 
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Figure 5: JT for the pedigree BNT. The star * indicates the cluster to which is associated 
each variable. 

Example 9. We can see in Fig. [5J a JT associated to the pedigree BNT of Fig. [3} Condi- 
tions JT1 (tree) and JT3 (covering) are clearly respected. This is also true of JT2 (running 
intersection) even if it is less obvious. For an illustrative purpose, let us verify JT2 in 
two particular cases: 1) C\ HCj = {X3} which means that C2 and C4 must contain Xy, 2) 
C3 H C5 = {Xg} which means that C2 and C4 must also contain Xg. 



4 Results 

4.1 Messages 

def 

For any edge i — j of the JT, we define the following two sets: Sy = Sjj = Q D Cj the sep- 

def 

arator set; Ui^j = , / G path(cl(w) , j) } the upstream set {Ui^j U Vj-^i is a partition 

of We then define the message function M^j for all Xs u G ^5. , by: 

(X 5 ,J ^ l^.P (Z^ ^ |X L ^, ) (11) 

def 

with the convention that 1^ = 1 and with L^j = Uj^j fl Sy (£;->; ULy-j.,- is a partition of 
Sy), = \ Sy (V^j U V/_>j is a partition of % \ Sy). 
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Example 10. In the particular case of the precipitation HMM, for all i = 1 ... n — 1 we 
have Si >i+ i = {Si}, t/j-^+i = {S\:uY 1:i }, and Ui+i-n = {5i + i: B ,^+i:»}. We hence have 
Mi^ i+ i (Si) = F(Si,Y 1:i = yu) and M i+1 ^ (Si) = F(Y hi = y u \Si). We recognize the for- 
ward and backward quantities of Eq. (|2]). 

Example 11. For the pedigree BNT and the JT of Fig. [5]we obtain the following messages: 

• Mi_> 2 (X 3 : 4 ) =l^ :4 P(X3:4,^l:2),M 2 _,i(X 3 :4) = P(<%: 10 1*3:4); 

• M 2 ^ 3 (X 4l X 9 ) = 1 A P(Z 4 , ^1:3,5,7:8,10^9)^3^2(^4,^9) = 1^(X 9 ^ 6 \X 4 ); 

• M 2 ^4(^3,^9) = 1^ 9 P (^3,9,^1:2,4,6),M4^2(X 3 ,X 9 )=P(^,7:8,10|^3,9); 

. m 4 ^ 5 (X7,x 9 ) = i^ 9 P(x 7)9 ,<ri :6) 8), m 5 ^ 4 (x 1: x 9 ) = P(Ao|x 7 ,9); 

• M 4 ^ 6 (X 3 ,X 7 ) = l^¥(X 3 ^ h2 ,4,6,9:10\X 1 ),M 6 ^ 4 (X 3 ,X 1 ) = 1 #7 P(X 7 , ^5,8^3); 

• AWX 3 ,X 5 ) = 1 4 5 P(X 3! 5, ^1:2,4,6:7,9:10), M^ 6 (X 3 ,X 5 ) =F(<? S \X 3 , 5 ). 

Lemma 12. For all « 6 we introduce the potential K u (Xf & ( u \) = f (X u \X pa ^) and 
get: 

Mi^j(X Su )= £ EI K u(X H u))- (12) 

Proof. From the definition of the potential K u , it is first clear that 

II K "i x Hu)) = 1 ^^(Xu^ j \X HUi ^ j) \ u ^ j ). 



Moreover if u G fa(Ui-^j) \ U^j, the covering property ensure that u appears at least once 
the upstream side of i — > j. Moreover, since Ui-+j U Uj-*i = ^ is a partition, u also appears 
on the downstream side of i — > j. The running intersection property hence proves that 
u E Q fl Cj = Sjj. Since fa(£/^j) \ U^j C Sij we therefore can write: 

and the summation over Xy j ^ j immediately proves the lemma. □ 

Although it is not proved in the same way, one should not that this lemma corresponds 
exactly to Theorem 10.3 page 354 in Koller and Friedman] (|2"009). 
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Example 13. For the pedigree BNT, we obtain the following potentials: 

• Ki(Xi) = l A P(Zi); • K 6 (X 6 ) = 1 #6 P(Z 6 ); 

• K 2 (X 2 ) = 1^P(X 2 ); • K 7 (X 3 ,X S ,X 7 ) = l^F(X 7 \X 3 ,X 5 ); 

• K 3 (X U X 2 ,X 3 ) = l^P(X3|Z l5 Z 2 ); . K S (X 3 ,X 5 ,X S ) = 1^F(X S \X 3 ,X 5 ); 

• K 4 (X h X 2 ,X 4 ) = lg 4 F(X 4 \X u X 2 ); • K 9 (X 4 ,X 6 ,X 9 ) = l^F(X 9 \X 4 ,X 6 ); 

• K 5 (X 5 ) = 1 4 P(X 5 ); • K 10 (X 7 ,X 9 ,X W ) = l^ 10 P(Zio|X 7) X9). 

4.2 Marginal distributions 

Proposition 14. For any edge i — j of the JT, and for all Xs u G ^5. . we have: 

F(X Sip g) =M^ j (X Sij )M Mi (X s , j ). (13) 

Proof. Starting from 

with use the fact that Vi-^j U Vj-+i is a partition of \Sij and that j U t//-^ is a partition 
of ^ to write: 

p fe^)= Lin n k u (x Hu) ) 

Xv^j X V Hi ueUi-+ j ueU ;_>.; 

and since for all w G j it is clear that K u (-Xf a ( M )) does not depend on Xy ^. we finally 
obtain: 

V V 



which achieves the proof. □ 
Example 15. For the precipitation HMM, Proposition [14| gives for all i = 1 . . .n— 1: 

P(5i,Fi :n =yi :n ) =iWi_ M - + i(S i )^+i_ M <Si) = 
which is exactly Eq ([3]). 
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Proposition 16. For any j G and for all Xq } G 2$Cj we have: 

P (lb,,*) = (ZbJ M»j (X Si J ) (14) 

Mi) 

where n( j) = f {?', i — j is an edge of the JT} denotes the neighbor set of j, and where <J> y (.Xc-) 
ILeC) K u (Xfe( B )) . wim = f {w G Cy, cl(it) = j}, is the potential of Cy. 



Proof. The proof is very similar to the one of Proposition 14 The key is here to realize 
that: 1) U(- en (y)V^ is a partition of 'W \Cj; 2) C* U,- en Q-) U^j is a partition of . □ 

Example 17. In the particular case of the precipitation HMM, for i = 2 ... n — 1 we have 

Q = {Si- h St,Yi}, C* = {S u Yi}, n(i) = {/- + and hence F(Si- h S u Y 1:n =y hn ) = 
P(5i|5 f _i)F(^ = yilS^Mi-x^iiSi-^Mi+x^iiSi), which corresponds to Eq <g>. 

Example 18. For the pedigree BNT, the marginal distributions of all clusters are the fol- 
lowing: 

• F(X U X 2 ,X 3 ,X4,£) =K 1 (X 1 )K 2 (X 2 )K 3 (X h X 2 ,X 3 )K 4 (X 1 ,X2,X 4 )M 2 ^ 1 (X 3 ,X 4 y, 

• F(X 3 ,X 4l X 9 ^)=M^ 2 (X 3l X 4 )M 3 ^ 2 (X 4 ,X 9 )M 4 ^ 2 (X 3l X 9 y, 

• F(X 4 ,X 6 ,X 9 ,g) = K 6 (X 6 )K 9 (X 4 ,X 6 ,X 9 )M 2 ^ 3 (X 4 ,X 9 ); 

• F(X 3 ,X 1 ,X 9 ,^)=M 2 ^(X 3 ,X 9 )M 5 ^(X 7 ,X 9 )M 6 ^(X 3 ,X 7 y, 

• F(X 1 ,X 9 ,X l0 ,^)=K l0 (X 7 ,X 9 ,X l0 )M^ 5 (X 7 ,X 9 y, 

• F(X 3 ,X 5 ,X 7 ^)=K 5 (X 5 )K 7 (X 3 ,X 5 ,X 7 )M 4 ^ 6 (X 3 ,X 7 )M 7 ^ 6 (X 3 ,X 5 y 

• F(X 3 ,X 5 ,X 8 ,<?)=Ks(X 3 ,X 5 ,Xz)M 6 ^(X 3 ,X 5 ). 

Using the messages computed in Table [3] (see next section for more details on this 
computation), we get: 



= Ix 3 Mi^ 2 (X 3 , DD)M 2 ^i (X 3 , DD) = 0.0000480 + 0.0001 152 = 0.0001632; 
P(Xi = dD|<T) = 0.7647, and P(Zi = DD|<T) = 0.2353; 
F(X 2 = DD|<f) = 1.0000; 

F(X 3 = dD|<f ) = 0.2941, and F(X 3 = DD|<T) = 0.7059; 
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P(X 4 = 


DD|<T) = 


1.0000; 


F(X 5 = 


dD|<T) = 


0.9412, and P(X 5 = DD|<T) = 0.0588; 


F(X 6 = 


dd|<T) = 


0.5333, F(X 6 = dD|<T) = 0.4000, and P(X 6 


P(X 7 = 


dD|<T) = 


1.0000; 


P(X 8 = 


DD|0 = 


1.0000; 


P(Xi = 


dD|<T) = 


0.6778, and P(Zi = DD|<T) = 0.3333; 


P(Xi = 


= DD|<f) = 


= 1.0000. 



One should note that these marginal distributions only describe roughly the distribution 
F(X^\S > ). For example, if we consider the joint distribution of (X^,X^) (obtained by 
the product of messages M 6 ^ 7 and M 7 ^ 6 ) we get: P(X 3 = dD,X 5 = dD|<f) = 0.2353, 
P(X 3 = dD,Z 5 = DD|<T) = 0.0588, P(X 3 = DD,X 5 = dD|<T) = 0.7059, and P(X 3 = DD,X 5 = 
= 0.000 while (for example) P(X 3 = DD|<f) x F(X 5 = DD|<T) = 0.0415 ^ 0.000. 



4.3 Recursions 

Corollary 19. For all j — k edge of the JT, for all Xsj k G $>s- k we have: 

M Hk (x Sjk )= £ ®j(x Cj ) n M > •/(- y .v,.)- 

ien(j),i^k 



Xc i\ s JM 



Proof. Start with 



and apply Eq. ( [13] ) to the left-hand and Eq. ( [14] ) to the right-hand. 
Example 20. In the particular case of the precipitation HMM, we get: 



(15) 



□ 



for all / = 2. . .n- 1, M^ i+1 (S,) = Zs^ P(S;|Si-iTO = y^M^i (S { _i) which 
is exactly the forward recursion of Eq. ([5]); 

for all i = 1 . . . n - 2, M i+1 ^ (5 f ) = Zs i+1 = Vi+i |%i)M^ ; _! (S,) 

which is exactly the forward recursion of Eq. ([6]). 
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Since in that case the JT is in fact reduced to a simple sequence, messages in the forward 
and backward directions can be computed independently. This is however not true in the 
general case where a more subtle recursion algorithm is needed. 

Proposition 21 (inward-outward algorithm). If we choose a root re / for the JT, we call 
inward message any message orientated from leaves to the root, and outward message any 
message in the opposite direction. We define on i G / two recursive function: 

• inward(i'): for all j offspring of i do call inward(j), and compute Af 

• outward(?'): for all j offspring of i do compute M^j, and call outward(j'). 

Then all inward messages can be computed by calling inward(r), and then, the remaining 
outward messages by calling outward(r). 



Proof. See classical textbooks (Cowell et al. 1999 ; Jensen and Nielsen 2007 ; Koller and 



Friedman 2009) for a detailed proof. □ 



One should note that if the inward recursion only involve inward messages, the outward 
recursion involves both inward and outward messages. This means that unlike with the 
forward-backward recursion in HMM, the two recursions cannot be done independently. 
Another interesting remark is that thanks to Eq. ( 14), the recursion inward(r) is sufficient 
to obtain F(X Crl S) and hence also P(<f ) . 

Example 22. If we now come back to the precipitation HMM and if we root the JT in r = n, 
then inward(n) perform the standard forward recursion, and outward(n) perform the 
backward one. However, other rooting are possible. For example if we choose r = i E J* 
with i ^ 1 and i ^ n, then inward(z') allows to compute P(5/_i,5j,Fi :n = y\- n ) directly, 
the inward messages involved in the process being a mixture of forward and backward 
messages. 

Example 23. For the pedigree BNT with root r = 1, the inward recursion is: 

• M^ 6 (X 3 ,X 5 ) = Z Xs K 8 (X 3 ,X 5 ,Xs); 

• M 6 ^(X 3 ,X 1 )=^ X5 K 5 (X 5 )K 7 (X 3 ,X 5 ,X 1 )M 1 ^ 6 (X 3 ,X 5 ); 

• M 5 ^ 4 (Xj : X 9 ) =^10^10(^7,^9,^10); 

• M 4 ^ 2 (X 3 ,X 9 )=£ X7 M 5 ^ 4 (X 1 ,X 9 )M 6 ^ 4 (X 3 ,X 1 ); 

• ^3^2(^4,^9) = Ex 6 ^6(^6)^9(^4,^6,^9); 
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Table 3: Messages of the pedigree BNT. First part of the table corresponds to the inward 

messages computed using Cluster 1 as root. The second part of the table corresponds to 

the outward messages. 

X u Xj dd,dd dd,dD dd,DD dD,dd dD,dD dD,DD DD,dd DD,dD DD,DD 

M 7 ^ 6 (X 3 ,X 5 ) 0.0000 0.0000 0.0000 0.0000 0.2500 0.5000 0.0000 0.5000 1.0000 

M 6 ^ 4 (X 3 ,X 7 ) 0.0000 0.0000 0.0000 0.0200 0.0500 0.0000 0.0000 0.0800 0.0000 

M 5 ^ 4 (X 7 ,X 9 ) 0.0000 0.0000 0.0000 0.0000 0.2500 0.5000 0.0000 0.5000 1.0000 

M 4 ^ 2 (X 3 ,X 9 ) 0.0000 0.0000 0.0000 0.0000 0.0250 0.0500 0.0000 0.0400 0.0800 

M 3 ^ 2 (X 4 ,X 9 ) 0.8000 0.2000 0.0000 0.4000 0.5000 0.1000 0.0000 0.8000 0.2000 

M 2 ^i(X 3 ,X 4 ) 0.0000 0.0000 0.0000 0.0025 0.0088 0.0150 0.0040 0.0140 0.0240 

1000xMi^ 2 (X 3 ,X 4 ) 0.0000 0.0000 0.0000 0.0000 0.0000 3.2000 0.0000 0.0000 4.8000 

1000xM 2 ^ 3 (X 4 ,X 9 ) 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1360 0.2720 

1000xM 2 ^ 4 (X 3 ,X 9 ) 0.0000 0.0000 0.0000 0.0000 2.5600 0.6400 0.0000 3.8400 0.9600 

1000xM 4 ^ 5 (X 7 ,X 9 ) 0.0000 0.0512 0.0128 0.0000 0.4352 0.1088 0.0000 0.0000 0.0000 

1000xM 4 ^ 6 (X 3 ,X 7 ) 0.0000 0.0000 0.0000 0.0000 0.9600 1.9200 0.0000 1.4400 2.8800 

1000xM 6 ^ 7 (X 3 ,X 5 ) 0.0000 0.0000 0.0000 0.3072 0.1536 0.0192 0.9216 0.2304 0.0000 



• M 2 ^(X 3 ,X 4 ) = Y, X<) M 3 ^ 2 (X 4 ,X 9 )M 4 ^ 2 (X 3 ,X 9 ). 
The outward recursion is (inward messages are underlined): 

• Mi_> 2 (X 3 ,X0 = Ix u x 2 Ki(X 1 )K 2 (X 2 )K 3 (X h X 2 ,X 3 )K 4 (X h X 2 ,X 4 y, 



• M 2 . 


+3(^4,^9) 


= Lx 3 Mi- 


, 2 (X 3 ,X 4 )M 4 - 


^2(X 3 ,Xg); 


• M 2 - 


^ 4 (X 3 ,X 9 ) 


= Lx 4 M 1 _ 


, 2 (X 3 ,X 4 )M 3 ^ 2 (X 4 ,X 9 ); 


• M 4 _ 


^(Xj^Xg) 


= Lx 3 M 2 - 


, 4 (X 3 ,X 9 )M 6 _ 


-> 4 (X 3l X 7 ); 


• M 4 _ 


^6( X 3:X 7 ) 


= Lx 9 M 2 - 


, 4 (X 3 ,X 9 )M 5 _ 


-> 4 (X 7 ,Xg); 



• M 6 ^ 7 (X 3 ,X 5 ) = Z Xl K 5 (X 5 )K 7 (X 3 ,X 5 ,X 7 )M 4r + 6 (X 3 ,X 7 ). 
The results of these recursions are given in Table [3] 

4.4 Sampling 

Corollary 24. For all edge j — k of the JT, for all Xq G 3>Cj we have: 

p (x c .\x Sv s) = ^ (^nienuw^j fe,; 
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Proof. Immediate by dividing Eq. ([14]) by Eq. ( 13 ). 



□ 



Example 25. In the particular case of the precipitation HMM, we get: 



• for all i = 1 ...n — 2, 



F(S i+1 \Si,Y hn =y 1:n ) 



P(5 7+ i|5 ; )P(y i+ i=3;, +1 |5 i+ i)M ; , 
M i+1 ^i(Si) 



(Si-l) 



which is exactly Eq. ([7]); 



• for all i = 2 ... n — 1 , 



W{S i - l \S h Yv.n=yV.n) 



Sj-{jF{Yi = yi\Si)M i - i ^ i {S i - l ) 
M^ i+1 (Si) 



which is exactly Eq. ([8]). 



Both formulas allows to sample from F(S\ :n \Y\ :n = y\ M ) sequentially (either in the forward 
or backward direction). Like for the recursions in previous section, this is due to the par- 
ticular structure of the JT (a sequence) and a more subtle sampling algorithm is necessary 
in general. 

Proposition 26 (sampling). For any root r G % , a sample from P(X^|<f) is recursively 
obtained by calling inward(r) and then sample(r) with 

• sample(j): draw F(Xc i \Xs ip , ( .,£') and for all j offspring of i call sample(y') 

where pa(z) denotes the parent of i in the rooted JT and SV.paM = ® by convention. 

Proof. The proof is the same than for the inward recursion. □ 

Example 27. For the pedigree BNT, sampling from P(Xi : io|<^) is achieved through: 



• sample (X h X 2 ,X 3 ,X 4 ) fromP(Xi,X 2 ,X3,X 4 |0 




• sample X 9 from F(X 9 \X 3 ,X 4 , S) 



M 3 ^ 2 (X 4 ,X 9 )M 4 ^ 2 (X 3 ,X 9 ) 
M 2 ^(X 3 ,X 4 ) 



• sample X 6 from P (X 6 \X 4 , X 9 , <f ) 



K(,(X(,)K 9 (X 4: X( n X 9 ) _ 
M 3 ^ 2 (X 4 ,X 9 ) ; 



• sampled from F(X 7 \X 3 ,X 9 ,£) 



M 5 ^ 4 (X 7 ,X 9 )M^ 4 (X 3 ,X 7 )_ 
M 4 ^ 2 (X 3 ,X 9 ) 
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Table 4: Five samples drawn from P(Xi : io|<»). The marginal posterior distribution of each 
variable is also given. 



variable 


Xi 


x 2 


*3 


x 4 


*5 


x 6 


X 7 


*8 


x 9 


Xio 


sample 1 


dD 


DD 


dD 


DD 


DD 


dD 


dD 


DD 


DD 


DD 


sample 2 


dD 


DD 


DD 


DD 


dD 


dD 


dD 


DD 


DD 


DD 


sample 3 


DD 


DD 


DD 


DD 


dD 


dd 


dD 


DD 


dD 


DD 


sample 4 


dD 


DD 


dD 


DD 


DD 


dd 


dD 


DD 


dD 


DD 


sample 5 


dD 


DD 


dD 


DD 


dD 


dd 


dD 


DD 


dD 


DD 


P(X. = dd\£) 
P(X. = dD |<f ) 
P(X. =DD|<r) 


0.00 
0.76 
0.24 


0.00 
0.00 
1.00 


0.00 
0.29 
0.71 


0.00 
0.00 
1.00 


0.00 
0.94 
0.06 


0.53 
0.40 
0.07 


0.00 
1.00 
0.00 


0.00 
0.00 
1.00 


0.00 
0.67 
0.33 


0.00 
0.00 
1.00 



sample X 10 from P(X 10 |X 7 ,X 9 ,^) - ^°(X 7 ,X 9 ,X 10 ) 



sample X 5 from P(X 5 |X 3 ,X 7 , <f ) 
sample X 8 from P(X 8 |X 3 ,X 5 ,£) 



M 5 ^ 4 (X 7 ,X 9 ) ' 

jf 5 (X5)^7(X3,X5,X 7 )M 7 ^ 6 (X3,X 5 ) 

M 6 _, 4 (X 3 ,X 7 ) 

^8(X3,X 5 ,X 8 ) . 
M 7 ^ 6 (X 3 ,X 5 )' 



We can see on Table [4] five samples drawn from P(Xi : io|^) using these conditional prob- 
abilities. 

One should note that it also possible to sample from P(X^|<f ) for any "V C ^ in a 
slightly more efficient way by restraining the sampling recursion to a subtree of the JT. 



5 Discussion 



We have introduced here with Eq. ( fTl) an explicit definition of messages in BNTs. To the 
best of our knowledge, this surprisingly seems to be the first time. Indeed, when looking 
either in the founding papers and textbooks where exact BP was initially developed ( |Pearl[ 



1986] [T988| |Lauritzen and~S piegemaIterJ [T988| |Shafer and Shenoyl [T9901 |Jensen et al. 



1990a|bj ), or in the most recent work on the subject ( pensen and Nielsen] |2007[ [Roller and 



Friedman} |2009[ |Tarlow et al] |2010[ |Caetano and McAuley[ |2011[ ), messages are always 
defined implicitly through the recursive formula of Eq. ( [TB] ). This might be due to the 
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fact that the popular approximated BP algorithms (ex: loopy BP) are all based on similar 
recursive formulas. 

However, the explicit message-centered approach that we suggest here has several ad- 
vantages over the classical approach of exact BP Firstly, it follows the sketch of the theory 
of inference in HMMs allowing to introduce BNTs as a natural extension of these well- 
known models from definitions to proofs, with obvious pedagogical benefits. Secondly, it 
provides a compact and straightforward proof of all exact BP results (the only steps which 
require some work are Lemma [T2], Proposition [14} and Proposition 16). Finally, it extends 



a step further the parallelism pointed out by Smyth et al. (T997J) between Markov sequence 



related models (Markov chains, HMMs, Markov tree) and BNTs therefore opening new 
exciting possibilities for those who work with HMMs models and variants without having 
to refer to the general theory of exact BP in BNTs to prove the resulting formulas. 

For example, suppose we consider X\- M an homogeneous Markov chain with starting 
distribution /! and transition matrix it, and would like to sample from P(Xi :n |Xi = X n ). By 
introducing an appropriate BNT (left to the reader), we can easily establish that ¥(X\ \X\ — 
X n ) « niX^-^XuXi) and that P(X f |Xf_i = X n ) « niX^X^ 1 ^^) for 
all i = 2 . . . n — 1. Of course, this result can be obtained directly without introducing any 
BNT, but our message-centered approach provides without effort a complete sketch of 
the proof. This might prove itself very useful when working with sophisticated extension 
of Markov sequence related models (ex: HMMs with partially observed hidden states, 
complex dependencies, or multiple observations; evolutionary processes through Markov 
trees including loops, etc.). 

For further work, it would be interesting to extend our approach to more general propa- 
gation than the sum-product one we consider here. For example, max-product propagation 
can be easily considered by replacing sums by maximums in Eq. ( fl"2| ), thus giving the fol- 
lowing max-message definition: 

Mf^(X SiJ )^l^ jx max F (X L ^,X Vi ^\X L ^) (17) 
from which all max-product propagation results can be easily derived. 



Appendix 

A R source code for the precipitation HMM 

# generates the data 

pi=matrix(c(0. 7, 0.3,0. 1,0.9) ,ncol=2 ,byrow=T) ; 
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11=100; 

s=numeric (n) ; 
s[l]=2; 

for (i in 2:100) s [i] =which(rmultinom(l , size=l ,prob=pi [s [i-1] J )==1) ; 
lambda=c(3.0,0.5) ; 
x=rpois(n,lambda=lambda[s] ) ; 
plot(x) ; 
index=l :n; 

points (index [s==l] ,x[s==l] ,col="blue") ; 
points (index [s==2] ,x[s==2] ,col="red") ; 

# forward and backward recursions 
e=rbind(dpois(x, lambda [1] ) ,dpois(x, lambda [2] )) ; 
F=0*e; B=0*e; 

F[2,l]=e[2,l] ; 

for (i in 2:n) F [, i] =t (F [, i-1] °/„*°/„pi) *e [, i] ; 
B[,n]=l; 

for (i in seq(n-l,l,by=-l)) B [ , i] =pi°/„*°/„(e [, i+1] *B [, i+1] ) ; 

# marginal distribution 
marginal=B*F/sum(B[,l]*F[,l]) ; 

plot (marginal [1,] ,t='l' , col="blue" , lwd=2) ; 
points (marginal [2,] ,t=' 1 ' , col="red" , lwd=2) ; 
points(s==l,col="blue") ; 
points (s==2 , col="red" ) ; 

# sampling from P(S|X) 
sample=NULL; 

for (iter in 1:5) { 
ss=numeric(n) ; 
ss[l]=2; 

for (i in 2:100) ss [i] =which(rmultinom(l , size=l , 

prob=pi[ss[i-l] ,]/B[ss[i-l] , i-1] *e [, i] *B [, i] )==1) ; 
sample=rbind( sample , ss) ; 

} 
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B R source code for the pedigree BNT 



# define the model 
p=0.2 

Pf=c((l-p)-2,2*p*(l-p),p~2) ; 

Pnf =matrix(rep(NA,27) ,nrow=3) ; 

Pnf [1,1=0(1,0.5,0,0.5,0.25,0,0,0,0) ; 

Pnf [2,] =c (0,0. 5, 1,0. 5, 0.5, 0.5, 1,0. 5,0) ; 

Pnf [3 , ] =c (0 , , , , . 25 , . 5 , , . 5 , 1) ; 

pair=f unction(Xl , X2) 3* (Xl-1) +X2 ; 

Kl=function(Xl) Pf [XI] ; 

K2=function(X2) (X2==3) *Pf [X2] ; 

K3=function(Xl,X2,X3) Pnf [X3,pair(Xl ,X2)] ; 

K4=function(Xl,X2,X4) (X4==3)*Pnf [X4,pair (XI ,X2)] ; 

K5=function(X5) Pf [X5] ; 

K6=function(X6) Pf [X6] ; 

K7=function(X3,X5,X7) (X7!=3)*Pnf [X7,pair (X3,X5)] ; 
K8=function(X3,X5,X8) (X8==3)*Pnf [X8,pair (X3,X5)] ; 
K9=function(X4,X6,X9) Pnf [X9, pair (X4,X6)] ; 
K10=f unction (X7,X9,X10) (X10==3)*Pnf [X10,pair(X7,X9)] ; 

# inward 
M76=rep(0,9) ; 

for (X3 in 1:3) for (X5 in 1:3) for (X8 in 1:3) 

M76 [pair (X3 , X5) ] =M76 [pair (X3 , X5) ] +K8 (X3 , X5 , X8) ; 
M64=rep(0,9) ; 

for (X3 in 1:3) for (X7 in 1:3) for (X5 in 1:3) 

M64 [pair (X3 , X7) ] =M64 [pair (X3 , X7) ] +K5 (X5) *K7 (X3 , X5 , X7) *M76 [pair (X3 , X5) ] ; 
M54=rep(0,9) ; 

for (X7 in 1:3) for (X9 in 1:3) for (X10 in 1:3) 

M54 [pair (X7 , X9) ] =M54 [pair (X7 , X9) ] +K10 (X7 , X9 , X10) ; 
M42=rep(0,9) ; 

for (X3 in 1:3) for (X9 in 1:3) for (X7 in 1:3) 

M42 [pair (X3 , X9) ] =M42 [pair (X3 , X9) ] +M54 [pair (X7 , X9) ] *M64 [pair (X3 , X7) ] ; 
M32=rep(0,9) ; 

for (X4 in 1:3) for (X9 in 1:3) for (X6 in 1:3) 

M32 [pair (X4 , X9) ] =M32 [pair (X4 , X9) ] +K6 (X6) *K9 (X4 , X6 , X9) ; 
M21=rep(0,9) ; 
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for (X3 in 1:3) for (X4 in 1:3) for (X9 in 1:3) 

M21 [pair (X3 , X4) ] =M21 [pair (X3 , X4) ] +M32 [pair (X4 , X9) ] *M42 [pair (X3 , X9) ] ; 

# outward 
M12=rep(0,9) ; 

for (X3 in 1:3) for (X4 in 1:3) for (XI in 1:3) for (X2 in 1:3) 

M12[pair(X3,X4)]=M12[pair(X3,X4)]+Kl(Xl)*K2(X2)*K3(Xl,X2,X3)*K4(Xl,X2,X4) ; 
M23=rep(0,9) ; 

for (X4 in 1:3) for (X9 in 1:3) for (X3 in 1:3) 

M23 [pair (X4 , X9) ] =M23 [pair (X4 , X9) ] +M12 [pair (X3 , X4) ] *M42 [pair (X3 , X9) ] ; 
M24=rep(0,9) ; 

for (X3 in 1:3) for (X9 in 1:3) for (X4 in 1:3) 

M24 [pair (X3 , X9) ] =M24 [pair (X3 , X9) ] +M12 [pair (X3 , X4) ] *M32 [pair (X4 , X9) ] ; 
M45=rep(0,9) ; 

for (X7 in 1:3) for (X9 in 1:3) for (X3 in 1:3) 

M45 [pair (X7 , X9) ] =M45 [pair (X7 , X9) ] +M24 [pair (X3 , X9) ] *M64 [pair (X3 , X7) ] ; 
M46=rep(0,9) ; 

for (X3 in 1:3) for (X7 in 1:3) for (X9 in 1:3) 

M46 [pair (X3 , X7) ] =M46 [pair (X3 , X7) ] +M24 [pair (X3 , X9) ] *M54 [pair (X7 , X9) ] ; 
M67=rep(0,9) ; 

for (X3 in 1:3) for (X5 in 1:3) for (X7 in 1:3) 

M67 [pair (X3 , X5) ] =M67 [pair (X3 , X5) ] +K5 (X5) *K7 (X3 , X5 , X7) *M46 [pair (X3 , X7) ] ; 
pevidence=sum(M12*M21) ; 
pevidence=sum(M67*M76) ; 

print(rbind(M76,M64,M54,M42,M32,M21) ,digits=12) ; 
print (rbind(M12,M23,M24,M45,M46,M67)*1000,digits=12) ; 

# marginal distributions 
Pl=rep(0,3) ; 

for (XI in 1:3) for (X2 in 1:3) for (X3 in 1:3) for (X4 in 1:3) 

Pl[Xl]=Pl[Xl]+Kl(Xl)*K2(X2)*K3(Xl,X2,X3)*K4(Xl,X2,X4)*M21[pair(X3 ) X4)] ; 
P2=rep(0,3) ; 

for (XI in 1:3) for (X2 in 1:3) for (X3 in 1:3) for (X4 in 1:3) 

P2 [X2] =P2 [X2] +K1 (XI) *K2 (X2) *K3 (XI , X2 , X3) *K4 (XI , X2 , X4) *M21 [pair (X3 , X4) ] ; 
P3=rep(0,3) ; 

for (XI in 1:3) for (X2 in 1:3) for (X3 in 1:3) for (X4 in 1:3) 

P3 [X3] =P3 [X3] +K1 (XI) *K2 (X2) *K3 (XI , X2 , X3) *K4 (XI , X2 , X4) *M21 [pair (X3 , X4) ] ; 
P4=rep(0,3) ; 
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for (XI in 1:3) for (X2 in 1:3) for (X3 in 1:3) for (X4 in 1:3) 

P4 [X4] =P4 [X4] +K1 (XI) *K2 (X2) *K3 (XI , X2 , X3) *K4 (XI , X2 , X4) *M21 [pair (X3 , X4) ] ; 
P5=rep(0,3) ; 

for (X3 in 1:3) for (X5 in 1:3) 

P5 [X5] =P5 [X5] +M67 [pair (X3 , X5) ] *M76 [pair (X3 , X5) ] ; 
P6=rep(0,3) ; 

for (X4 in 1:3) for (X6 in 1:3) for (X9 in 1:3) 

P6 [X6] =P6 [X6] +K6 (X6) *K9 (X4 , X6 , X9) *M23 [pair (X4 , X9) ] ; 
P7=rep(0,3) ; 

for (X7 in 1:3) for (X9 in 1:3) 

P7 [X7] =P7 [X7] +M45 [pair (X7 , X9) ] *M54 [pair (X7 , X9) ] ; 
P8=rep(0,3) ; 

for (X3 in 1:3) for (X5 in 1:3) for (X8 in 1:3) 

P8 [X8] =P8 [X8] +K8 (X3 , X5 , X8) *M67 [pair (X3 , X5) ] ; 
P9=rep(0,3) ; 

for (X4 in 1:3) for (X6 in 1:3) for (X9 in 1:3) 

P9 [X9] =P9 [X9] +K6 (X6) *K9 (X4 , X6 , X9) *M23 [pair (X4 , X9) ] ; 
P10=rep(0,3) ; 

for (X7 in 1:3) for (X9 in 1:3) for (X10 in 1:3) 
P10 [X10] =P10 [X10] +K10 (X7 , X9 , X10) *M45 [pair (X7 , X9) ] ; 

# sampling 
P13=rep(0,9) ; 

for (XI in 1:3) for (X2 in 1:3) for (X3 in 1:3) for (X4 in 1:3) 
P13 [pair (XI , X3) ] =P13 [pair (XI , X3) ] +K1 (XI) *K2 (X2) * 

K3 (XI , X2 , X3) *K4 (XI , X2 , X4) *M21 [pair (X3 , X4) ] ; 
sample=matrix(rep(NA,5*10) ,nrow=5) ; 
for (iter in 1:5) { 

sample [iter , 2] =3 ; 

sample [iter , 4] =3 ; 

sample [iter , 8] =3 ; 

sample [iter , 10] =3 ; 

aux=which(rmultinom(l , size=l, prob=P13/pevidence)==l) ; 
sample [iter, l]=f loor( (aux-1) /3)+l ; 
sample [iter ,3] =aux-3*f loor ( (aux-1) /3) ; 
CP9=rep(NA,3) ; 
for (X9 in 1:3) { 

CP9 [X9] =M32 [pair (sample [iter , 4] , X9) ] *M42 [pair (sample [iter , 3] , X9) ] / 
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M21 [pair (sample [iter ,3] , sample [iter, 4] )] ; 

} 

sample [iter , 9] =which(rmultinom(l , size=l, prob=CP9)==l) ; 
CP6=rep(NA,3) ; 
for (X6 in 1:3) { 

CP6 [X6] =K6 (X6) *K9 (sample [iter , 4] , X6 , sample [iter , 9] ) / 
M32 [pair (sample [iter ,4] , sample [iter, 9] )] ; 

} 

sample [iter , 6] =which(rmultinom(l , size=l, prob=CP6)==l) ; 
CP7=rep(NA,3) ; 
for (X7 in 1:3) { 

CP7 [X7] =M54 [pair (X7 , sample [iter , 9] ) ] *M64 [pair (sample [iter , 3] , X7) ] / 
M42 [pair (sample [iter ,3] , sample [iter, 9] )] ; 

} 

sample [iter ,7] =which(rmultinom(l , size=l, prob=CP7)==l) ; 
CP5=rep(NA,3) ; 
for (X5 in 1:3) { 

CP5 [X5] =K5 (X5) *K7 (sample [iter , 3] , X5 , sample [iter , 7] ) * 
M76 [pair (sample [iter ,3] ,X5)] /M64 [pair (sample [iter ,3] , sample [iter ,7] )] ; 

} 

sample [iter , 5] =which(rmultinom(l , size=l, prob=CP5)==l) ; 
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